Recombinant butyrylcholinesterases and truncates thereof

ABSTRACT

Isolated nucleic acids encoding polypeptides that exhibit butyrylcholinesterase (BChE) enzyme activity are disclosed, along with molecular criteria for preparing such nucleic acids, including codon optimization. Methods of preparing modified and/or truncated BChE molecules having selected properties, especially selective formation of monomers, are also described. Vectors and cells containing and/or expressing the nucleic acids are also disclosed.

This application claims priority of U.S. provisional Application 61/284,444, filed 21 Dec. 2009, the disclosure of which is herein incorporated by reference in its entirety.

FIELD OF THE INVENTION

The present invention provides methods for the production of recombinant butyrylcholinesterases, including truncates thereof, using polynucleotides codon-optimized for expression in mammalian, especially human, cells.

BACKGROUND OF THE INVENTION

The general term cholinesterase (ChE) refers to a family of enzymes involved in nerve impulse transmission. Cholinesterase-inhibiting substances such as organophosphate compounds or carbamate insecticides or drugs prevent the breakdown of acetylcholine, resulting in a buildup of acetylcholine, thereby causing hyperactivity of the nervous system. This effect occurs when humans breathe or are otherwise exposed to these compounds, which has led to the development of these compounds as “nerve gases” or chemical warfare agents.

Those enzymes which preferentially hydrolyze other types of esters such as butyrylcholine, and whose enzymatic activity is sensitive to the chemical inhibitor tetraisopropylpyrophosphoramide (also known as iso-OMPA), are called butyrylcholinesterases (BChE, EC 3.1.1.8).

Butyrylcholinesterase (BChE), also known as plasma, serum, benzoyl, false, or Type II ChE, has more than eleven isoenzyme variants and preferentially uses butyrylcholine and benzoylcholine as in vitro substrates. BChE is found in mammalian blood plasma, liver, pancreas, intestinal mucosa, the white matter of the central nervous system, smooth muscle, and heart. BChE is sometimes referred to as serum cholinesterase as opposed to red cell cholinesterase (AChE).

The use of cholinesterases as pre-treatment drugs has been successfully demonstrated in animals, including non-human primates. For example, pretreatment of rhesus monkeys with fetal bovine serum-derived AChE or horse serum-derived BChE protected them against a challenge of two to five times the LD50 of pinacolyl methylphosphonofluoridate (soman), a highly toxic organophosphate compound used as a war gas [Broomfield, et al. J. Pharmacol. Exp. Ther. (1991) 259:633-638; Wolfe, et al. Toxicol Appl Pharmacol (1992) 117(2):189-193]. In addition to preventing lethality, the pretreatment prevented behavioral incapacitation after the soman challenge, as measured by the serial probe recognition task or the equilibrium platform performance task. Administration of sufficient exogenous human BChE can protect mice, rats, and monkeys from multiple lethal-dose organophosphate intoxication [see for example Raveh, et al. Biochemical Pharmacology (1993) 42:2465-2474; Raveh, et al. Toxicol. Appl. Pharmacol. (1997) 145:43-53; Alton, et al. Toxicol. Sci. (1998) 43:121-128]. Purified human BChE has been used to treat organophosphate poisoning in humans, with no significant adverse immunological or psychological effects (Cascio, et al. Minerva Anestesiol (1998) 54:337).

In addition to its efficacy in hydrolyzing organophosphate toxins, there is strong evidence that BChE is the major detoxifying enzyme of cocaine [Xie, et al. Molec. Pharmacol. (1999) 55:83-91]. Cocaine is metabolized by three major routes: hydrolysis by BChE to form ecgonine methyl ester, N-demethylation to form norcocaine, and non-enzymatic hydrolysis to form benzoylecgonine. Studies have shown a direct correlation between low BChE levels and episodes of life-threatening cocaine toxicity. A recent study has confirmed that a decrease of cocaine half-life in vitro correlated with the addition of purified human BChE.

In view of the significant pharmaceutical potential of ChE enzymes, research has focused on development of recombinant methods to produce them. Recombinant enzymes, as opposed to those derived from plasma, have a much lower risk of transmission of infectious agents, including viruses such as hepatitis C and HIV.

The cDNA sequences have been cloned for both human AChE (see U.S. Pat. No. 5,595,903) and human BChE [see U.S. Pat. No. 5,215,909 to Soreq; Prody, et al. Proc. Natl. Acad. Sci. USA (1987) 84:3555-3559; McTiernan, et al. Proc. Natl. Acad. Sci USA (1987) 84:6682-6686]. The amino acid sequence of wild-type human BChE, as well as of several BChE variants with single amino acid changes, is set forth in U.S. Pat. No. 6,001,625.

Notably, none of the recombinant expression systems reported to date have the ability to produce BChE in quantities sufficient to allow development of the enzyme as a drug to treat such conditions as organophosphate poisoning, post-surgical apnea, or cocaine intoxication. An additional problem is longevity: the longer the BChE remains in the system of a person treated, the longer it is available for detoxification. Such lifespan is referred to as the “mean residence time” (MRT) in the system.

The current state of the art for BChE is directed to making the tetramer form because it is the “native form” and is thus considered to be more stable, with a longer “mean residence time” (MRT). However, due to the very large size of the tetramer, it is difficult to prepare. In addition, such preparation usually results in a low-yield mixture of tetramer, dimer and monomer forms, which has proven both very cumbersome and very expensive to purify and characterize. As a result, it is probably too expensive to make as a useful therapeutic product. In view of the foregoing, more powerful methods of producing BChE are needed.

In sum, the current obstacles in the manufacture of the native BChE molecule as a bioscavenger product are: 1) low yield, 2) complex manufacturing process (milk), 3) short half-life (thus requiring pegylation), 4) highly heterogeneous product (difficult to characterize and obtain FDA approval) and 5) high cost of the product.

The present invention addresses at least some of these problems by providing inter alia a truncated monomeric form of BChE. While the monomer form is just as active as the tetrameric form, it has been considered to be less stable (i.e., have a lower “MRT”) than the tetramer. This may be because the protein made is not properly glycosylated and/or sialylated. Applicants have identified a cell line and clone to accomplish this result. Furthermore, if the full length BChE is made, the cells produce a mixture of monomer, dimer and tetramer, so the present invention also provides a means of producing preferably the monomeric form.

BRIEF SUMMARY OF THE INVENTION

In one aspect, the present invention relates to an isolated nucleic acid, which may be DNA, such as a cDNA, or RNA, that encodes a polypeptide having BChE enzyme activity (as determined, for example, using the well known Ellman assay), wherein the nucleic acid has been codon-optimized, such as where the percentage of guanine plus cytosine (G+C) nucleotides in the coding region of the nucleic acid is greater than about 40%, or is greater than 45%, or is greater than 50%, or is greater than 55%, or is at least 60%, or is greater than 60% but not greater than 80%.

In specific embodiments, the isolated nucleic acid does not contain internal structural elements that reduce expression levels of the subject genetic construct, including an internal TATA-box, an internal ribosomal entry site, or a splice donor or acceptor site.

In one embodiment, the isolated nucleic acid of the invention contains or encodes at least one Kozak sequence, preferably upstream of the start site.

The isolated nucleic acid of the invention also encodes one or more glycosylation and/or sialylation sites on the synthesized polypeptide. In a preferred embodiment, these are sufficient in number to permit full glycosylation and/or sialylation of the encoded BChE.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1-1 to 1-4 show codon optimized nucleotide (SEQ ID NO: 1) and corresponding amino acid (SEQ ID NO: 2) sequences of a BChE of the invention. Such sequences contain inserted restriction and other sites, such as the human signal or leader sequence made up of amino acids 1 to 21, where full length BChE begins with the sequence EDD starting at amino acid residue 22.

FIGS. 2-1 to 2-2 show nucleotide (SEQ ID NO: 5) and corresponding amino acid (SEQ ID NO: 6) sequences of native human BChE (i.e., without codon optimization), containing a goat casein leader or signal polypeptide (amino acids 1 to 15) for transgenic expression in goat's milk, and where the BChE native polypeptide begins with the sequence EDD starting at amino acid residue 17 (the glutamic acid numbered 1, with an inserted arginine numbered 1′).

FIGS. 3-1 to 3-4 show codon-optimized nucleotide (SEQ ID NO: 3) and amino acid (SEQ ID NO: 4) sequences of a BChE truncate of the invention, containing a human signal or leader sequence made up of amino acids 1 to 21, where the BChE truncate begins with the sequence EDD starting at amino acid residue 22.

DEFINITIONS

Unless expressly stated otherwise elsewhere herein, each of the following terms has the stated meaning:

The term “butyrylcholinesterase enzyme” or “BChE enzyme” means a polypeptide capable of hydrolyzing acetylcholine and butyrylcholine, and whose catalytic activity is inhibited by the chemical inhibitor iso-OMPA. Preferred BChE enzymes to be produced by the present invention are mammalian BChE enzymes. The term “BChE enzyme” also encompasses pharmaceutically acceptable salts of such a polypeptide.

The term “recombinant butyrylcholinesterase” or “recombinant BChE” means a BChE enzyme produced by a transiently transfected, stably transfected, or transgenic host cell or animal as directed by one of the expression constructs of the invention as well as by direct chemical synthesis. The term “recombinant BChE” also encompasses pharmaceutically acceptable salts of such a polypeptide.

The term “vector sequences” means any of several nucleic acid sequences established in the art which have utility in the recombinant DNA technologies of the invention to facilitate the cloning and/or propagation of the expression constructs including (but not limited to) plasmids, cosmids, phage vectors, viral vectors, and yeast artificial chromosomes.

The term “expression construct” or “construct” means a nucleic acid sequence comprising a target nucleic acid sequence or sequences whose expression is desired, operably linked to sequence elements which provide for the proper transcription and translation of the target nucleic acid sequence(s) within the chosen host cells. Such sequence elements may include a promoter, a signal sequence for secretion, a polyadenylation signal, intronic sequences, insulator sequences, and other elements described in the invention. The “expression construct” or “construct” may further comprise vector sequences.

The term “operably linked” means that a target nucleic acid sequence and one or more regulatory sequences (e.g., promoter or signal sequences) are physically linked so as to permit expression of the polypeptide encoded by the target nucleic acid sequence within a host cell.

The term “promoter” means a region of DNA involved in binding of RNA polymerase to initiate transcription.

The term “signal sequence” means a nucleic acid sequence which, when incorporated into a nucleic acid sequence encoding a polypeptide, directs secretion of the translated polypeptide (e.g., a BChE enzyme and/or a glycosyltransferase) from cells which express said polypeptide. The signal sequence is preferably located at the 5′-end of the nucleic acid sequence encoding the polypeptide, such that the polypeptide sequence encoded by the signal sequence is located at the N-terminus of the translated polypeptide, and is commonly a leader sequence. The term “signal peptide” means the peptide sequence resulting from translation of a signal sequence.

The term “host cell” means a cell which has been transfected with one or more expression constructs of the invention. Such host cells include mammalian cells in in vitro culture and cells found in vivo in an animal. Preferred in vitro cultured mammalian host cells include Per.C6 cells.

The term “transfection” means the process of introducing one or more of the expression constructs of the invention into a host cell by any of the methods well established in the art, including (but not limited to) microinjection, electroporation, liposome-mediated transfection, calcium phosphate-mediated transfection, or virus-mediated transfection. A host cell into which an expression construct of the invention has been introduced by transfection is “transfected”. The term “transiently transfected cell” means a host cell wherein the introduced expression construct is not permanently integrated into the genome of the host cell or its progeny, and therefore may be eliminated from the host cell or its progeny over time. The term “stably transfected cell” means a host cell wherein the introduced expression construct has integrated into the genome of the host cell and its progeny.

In accordance with the present invention, the term “DNA segment” refers to a DNA polymer, in the form of a separate fragment or as a component of a larger DNA construct, which has been derived from DNA isolated at least once in substantially pure form, i.e., free of contaminating endogenous materials and in a quantity or concentration enabling identification, manipulation, and recovery of the segment and its component nucleotide sequences by standard biochemical methods, for example, using a cloning vector. Such segments are provided in the form of an open reading frame uninterrupted by internal non-translated sequences, or introns, which are typically present in eukaryotic genes. Sequences of non-translated DNA may be present downstream from the open reading frame, where the same do not interfere with manipulation or expression of the coding regions.

“Isolated” in the context of the present invention with respect to polypeptides (or polynucleotides) means that the material is removed from its original environment (e.g., the natural environment if it is naturally occurring). For example, a naturally-occurring polynucleotide or polypeptide present in a living organism is not isolated, but the same polynucleotide or polypeptide, separated from some or all of the co-existing materials in the natural system, is isolated. Such polynucleotides could be part of a vector and/or such polynucleotides or polypeptides could be part of a composition, and still be isolated in that such vector or composition is not part of its natural environment. The polypeptides and polynucleotides of the present invention are preferably provided in an isolated form, and preferably are purified to homogeneity.

The term “coding region” refers to that portion of a gene which either naturally or normally codes for the expression product of that gene in its natural genomic environment, i.e., the region coding in vivo for the native expression product of the gene. The coding region can be from a normal, mutated or altered gene, or can even be from a DNA sequence, or gene, wholly synthesized in the laboratory using methods well known to those of skill in the art of DNA synthesis.

In accordance with the present invention, the term “nucleotide sequence” refers to a heteropolymer of deoxyribonucleotides. Generally, DNA segments encoding the proteins provided by this invention are assembled from cDNA fragments and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a synthetic gene which is capable of being expressed in a recombinant transcriptional unit comprising regulatory elements derived from a microbial or viral operon.

The term “expression product” means that polypeptide or protein that is the natural translation product of the gene and any nucleic acid sequence coding equivalents resulting from genetic code degeneracy and thus coding for the same amino acid(s).

As used herein, the terms “portion,” “segment,” “truncate” and “fragment,” when used in relation to polypeptides, refer to a continuous sequence of residues, such as amino acid residues, which sequence forms a subset of a larger sequence. For example, if a polypeptide were subjected to treatment with any of the common endopeptidases, such as trypsin or chymotrypsin, the oligopeptides resulting from such treatment would represent portions, segments or fragments of the starting polypeptide. When used in relation to polynucleotides, such terms refer to the products produced by treatment of said polynucleotides with any of the common endonucleases.

The term “fragment,” when referring to a coding sequence, means a portion of DNA comprising less than the complete coding region whose expression product retains essentially the same biological function or activity as the expression product of the complete coding region.

The term “open reading frame (ORF)” means a series of triplets coding for amino acids without any termination codons and is a sequence (potentially) translatable into protein.
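The ORF definition above can be illustrated with a minimal Python sketch. The function name, the use of the standard DNA stop codons (TAA, TAG, TGA), and the choice to reject empty sequences are assumptions made for illustration and are not taken from the specification.

```python
# Illustrative sketch only: tests whether a DNA string is a series of
# complete triplets containing no in-frame termination codon.
# Assumption: standard genetic code stop codons, DNA alphabet (T, not U).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_open_reading_frame(dna: str) -> bool:
    """Return True if `dna` is a whole number of codons with no stop codon."""
    seq = dna.upper()
    # An ORF must consist of complete triplets (and must not be empty).
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False
    # Walk the sequence codon by codon in frame 0.
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return all(codon not in STOP_CODONS for codon in codons)
```

Under this definition a terminal stop codon lies outside the ORF, so a coding sequence would be passed to the function without its final termination codon.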

As used herein, reference to a DNA sequence includes both single stranded and double stranded DNA. Thus, the specific sequence, unless the context indicates otherwise, refers to the single strand DNA of such sequence, the duplex of such sequence with its complement (double stranded DNA) and the complement of such sequence.

In accordance with the present invention, the term “percent identity” or “percent identical,” when referring to a nucleotide or amino acid sequence, means that the sequence is compared to a claimed or described sequence after alignment of the sequence to be compared (the “Compared Sequence”) with the described or claimed sequence (the “Reference Sequence”). The Percent Identity is then determined according to the following formula:

Percent Identity = 100 [1 − (C/R)]

wherein C is the number of differences between the Reference Sequence and the Compared Sequence over the length of alignment between these sequences, wherein (i) each base or amino acid in the Reference Sequence that does not have a corresponding aligned base or amino acid in the Compared Sequence and (ii) each gap in the Reference Sequence and (iii) each aligned base or amino acid in the Reference Sequence that is different from an aligned base or amino acid in the Compared Sequence, constitutes a difference; and R is the number of bases or amino acids in the Reference Sequence over the length of the alignment with the Compared Sequence, with any gap created in the Reference Sequence also being counted as a base or amino acid.

If an alignment exists between the Compared Sequence and the Reference Sequence for which the percent identity as calculated above is about equal to or greater than a specified minimum Percent Identity then the Compared Sequence has the specified minimum percent identity to the Reference Sequence even though alignments may exist in which the hereinabove calculated Percent Identity is less than the specified Percent Identity.
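The Percent Identity formula can be worked through with a short Python sketch. It assumes the two sequences have already been aligned by some external tool and are supplied as equal-length strings in which a hyphen marks a gap; the function name and the gap symbol are illustrative choices, not part of the specification.

```python
# Illustrative sketch only: Percent Identity = 100 * [1 - (C / R)] computed
# from two pre-aligned, equal-length strings. Assumption: "-" marks a gap.
def percent_identity(reference: str, compared: str, gap: str = "-") -> float:
    if len(reference) != len(compared):
        raise ValueError("aligned sequences must have the same length")
    if not reference:
        raise ValueError("alignment must not be empty")

    differences = 0  # C in the formula
    for ref_char, cmp_char in zip(reference.upper(), compared.upper()):
        if ref_char == gap and cmp_char == gap:
            raise ValueError("a column cannot be a gap in both sequences")
        # (i) reference residue with no aligned partner (gap in compared),
        # (ii) gap in the reference, or (iii) mismatched aligned residues.
        if ref_char == gap or cmp_char == gap or ref_char != cmp_char:
            differences += 1

    # R: reference residues over the alignment, with each gap created in the
    # reference also counted, which equals the number of alignment columns.
    reference_length = len(reference)
    return 100.0 * (1.0 - differences / reference_length)
```

For example, aligning the reference `ACGT-A` with `ACCTTA` gives one mismatch and one gap in the reference over six columns, so C = 2, R = 6 and the Percent Identity is about 66.7.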

DETAILED DESCRIPTION OF THE INVENTION

Butyrylcholinesterase derived from human serum is a globular, tetrameric molecule with a molecular mass of approximately 340 kDa. Nine Asn-linked carbohydrate chains are found on each 574-amino acid subunit (which subunit begins with amino acid 17 in SEQ ID NO: 6). The tetrameric form of BChE is stable and has been preferred in the art for therapeutic uses. BChE enzymes produced according to the instant invention have the ability to bind and/or hydrolyze organophosphates (such as pesticides and war gases), succinylcholine, or cocaine.

The BChE enzyme of the present invention comprises an amino acid sequence that is substantially identical to a sequence found in a mammalian BChE, more preferably, human BChE, and may be produced as a tetramer, a trimer, a dimer, or a monomer. In a preferred embodiment, the synthesized BChE of the invention has a glycosylation and/or sialylation profile that is substantially similar, if not identical, to that of native human BChE.

The BChE produced according to the present invention is preferably in monomeric form with high MRT, thus reducing the need for expensive post-synthetic modification to increase MRT, such as pegylation (i.e., attachment of one or more molecules of polyethylene glycol of varying molecular weight and structure). By contrast, BChE expressed recombinantly in CHO (Chinese hamster ovary) cells was found not to be mostly in the more stable tetrameric form, but rather consisted of approximately 55% dimers, 10-30% tetramers and 15-40% monomers (Blong, et al. Biochem. J., Vol. 327, pp 747-757 (1997)).

Recent studies have shown that a proline-rich amino acid sequence from the N-terminus of the collagen-tail protein caused acetylcholinesterase to assemble into the tetrameric form (Bon, et al. J. Biol. Chem. (1997) 272(5):3016-3021 and Krejci, et al. J. Biol. Chem. (1997) 272:22840-22847). To greatly increase the amount of monomeric BChE enzyme formed according to the invention, the DNA sequence encoding the BChE enzyme of the invention preferably does not comprise a proline-rich attachment domain (PRAD), which otherwise recruits recombinant BChE subunits (e.g., monomers, dimers and trimers) to form tetrameric associations.

The non-tetrameric forms of BChE are also useful in applications which do not require in vivo administration, such as the clean-up of lands used to store organophosphate compounds, as well as decontamination of military equipment exposed to organophosphates. For ex vivo use, these non-tetrameric forms of BChE may be incorporated into sponges, sprays, cleaning solutions or other materials useful for the topical application of the enzyme to equipment and personnel. These forms of the enzyme may also be applied externally to the skin and clothes of human patients who have been exposed to organophosphate compounds. The non-tetrameric forms of the enzyme may also find applications as barriers and sealants applied to the seams and closures of military clothing and gas masks used in chemical warfare situations.

The present invention also provides vectors that include polynucleotides of the present invention, host cells which are genetically engineered with vectors of the invention and the production of polypeptides of the invention by recombinant techniques.

Host cells are genetically engineered (i.e., transduced or transformed or transfected) with the vectors of this invention which may be, for example, a cloning vector or an expression vector, preferably in the form of a plasmid, a viral particle, a phage, etc. The engineered host cells can be cultured in conventional nutrient media modified as required for activating promoters, selecting transformants or amplifying the genes of the present invention. The culture conditions, such as temperature, pH and the like, are those previously used with the host cell selected for expression, and will be apparent to the ordinarily skilled artisan.

The polynucleotides of the present invention are preferably employed for producing polypeptides by recombinant techniques. Such a polynucleotide may be included in any one of a variety of expression vectors for expressing a polypeptide. Such vectors include chromosomal, nonchromosomal and synthetic DNA sequences, e.g., derivatives of SV40; bacterial plasmids; phage DNA; baculovirus; yeast plasmids; vectors derived from combinations of plasmids and phage DNA, viral DNA such as vaccinia, adenovirus, fowl pox virus, and pseudorabies. However, any other vector may be used as long as it is replicable and viable in the host cell.

The appropriate DNA sequence may be inserted into the vector by a variety of procedures. In general, the DNA sequence is inserted into an appropriate restriction endonuclease site(s) by procedures known in the art. Such procedures and others are deemed to be within the scope of those skilled in the art.

The DNA sequence in the expression vector is operatively linked to an appropriate expression control sequence(s) (such as a promoter and/or enhancer, in either cis or trans location) to direct mRNA synthesis. As representative examples of such promoters, there may be mentioned: LTR or SV40 promoter, the phage lambda P_L promoter and other promoters known to control expression of genes in eukaryotic, preferably human, cells. The expression vector also contains a ribosome binding site for translation initiation and a transcription terminator. The vector may also include appropriate sequences for amplifying expression, especially where these are designed to work optimally in human cells, for example, Per.C6 cells.

The vector containing the appropriate DNA sequence as herein described, as well as an appropriate promoter or control sequence, may be employed to transform an appropriate host to permit the host to express the protein.

As representative examples of appropriate hosts, there may be mentioned human cells, preferably Per.C6 cells (available from Percivia, Cambridge, Mass.).

More particularly, the present invention also includes recombinant constructs comprising one or more of the sequences as broadly described above. The constructs comprise a vector, such as a plasmid or viral vector, into which a sequence of the invention has been inserted, in a forward or reverse orientation. In a preferred aspect of this embodiment, the construct further comprises regulatory sequences, including, for example, a promoter, operably linked to the sequence. Large numbers of suitable vectors and promoters are known to those of skill in the art and are commercially available.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol acetyltransferase) vectors or other vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Eukaryotic promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate vector and promoter is well within the level of ordinary skill in the art.

In a further embodiment, the present invention relates to host cells containing the above-described constructs. The host cell is preferably a human cell, since the nucleic acids of the invention have been optimized for such human expression. Introduction of the construct into the host cell can be effected by calcium phosphate transfection, DEAE-Dextran mediated transfection, or electroporation (Davis, L., Dibner, M., Battey, I., Basic Methods in Molecular Biology, (1986)). The constructs in host cells can be used in a conventional manner to produce the gene product encoded by the recombinant sequence. Alternatively, the polypeptides of the invention can be synthetically produced by conventional peptide synthesizers.

Appropriate cloning and expression vectors for use with eukaryotic hosts are described by Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, N.Y., (1989), the disclosure of which is hereby incorporated by reference.

Transcription of the DNA encoding the polypeptides of the present invention by higher eukaryotes, especially human cells, is increased by inserting an enhancer sequence into the vector. Enhancers are cis-acting elements of DNA, usually from about 10 to about 300 bp, that act on a promoter to increase its transcription. Examples include the SV40 enhancer on the late side of the replication origin (bp 100 to 270), a cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.

Generally, recombinant expression vectors will include origins of replication and selectable markers permitting transformation of the host cell and a promoter derived from a highly-expressed gene to direct transcription of a downstream structural sequence. Such promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled in appropriate phase with translation initiation and termination sequences, and preferably, a leader sequence capable of directing secretion of translated protein into the periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a fusion protein including an N-terminal identification peptide imparting desired characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the selected promoter is induced by appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an additional period. Cells are typically harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract retained for further purification.

Mammalian, especially human, cell expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, and 5′ flanking non-transcribed sequences. DNA sequences derived from the SV40 splice and polyadenylation sites may be used to provide the required non-transcribed genetic elements.

The polypeptide can be recovered and purified from recombinant cell cultures by methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite chromatography and lectin chromatography. Protein refolding steps can be used, as necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can be employed for final purification steps.

Because of the degeneracy of the genetic code, more than one codon may be employed to encode a particular amino acid. However, not all codons encoding the same amino acid are utilized equally. For optimal expression in a cell, e.g. a human cell, the nucleic acid, e.g. DNA, to be expressed may be codon optimized so as to contain a coding region utilizing the codons most commonly employed by that species or that particular type of cell. Codon optimization is a technique which is now well known and used in the design of synthetic genes. Different organisms preferentially utilize one or other of these different codons. By optimizing codons, it is possible to greatly increase expression levels of the particular protein in a selected cell type.

In accordance with the foregoing, embodiments of an isolated nucleic acid of the invention have been codon-optimized, which codon optimization is expressed as a Codon Adaptation Index (CAI), wherein such CAI for nucleic acids of the invention is at least 0.7, preferably at least 0.8, more preferably at least 0.9, and most preferably at least 0.97. For the wild-type (non-optimized) human BChE gene, the CAI is at or about 0.69. Such Codon Adaptation Index is determined according to methods known in the art by setting the quality value of the most frequently used codon for a given amino acid in the desired expression system to 100 with the remaining codons scaled accordingly (see Sharp and Li, Nucleic Acids Research, Vol. 15(3), pp. 1281-1295 (1987)). The CAI uses a reference set of highly expressed genes from a species to determine the relative merit of each codon and to calculate a score for a gene from the frequency of use of all codons in said gene. This index is useful for predicting the level of expression of a gene and for indicating the likely success of heterologous gene expression in a given cell system.
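The CAI calculation of Sharp and Li can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used to score SEQ ID NO: 1. The reference codon counts are a made-up toy table covering only alanine and lysine (real use would need counts from highly expressed human genes), weights are scaled to 1.0 rather than 100, and the function names and the choice to skip Met, Trp and stop codons are assumptions stated here for clarity.

```python
import math

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
GENETIC_CODE = dict(zip(CODONS, AMINO_ACIDS))

# HYPOTHETICAL toy reference counts (Ala and Lys only), for illustration.
TOY_REFERENCE_COUNTS = {"GCC": 40, "GCT": 20, "AAG": 30, "AAA": 15}


def relative_adaptiveness(reference_counts):
    """Weight of each codon = its count / count of the most used synonym."""
    best = {}
    for codon, count in reference_counts.items():
        amino_acid = GENETIC_CODE[codon]
        best[amino_acid] = max(best.get(amino_acid, 0), count)
    return {
        codon: count / best[GENETIC_CODE[codon]]
        for codon, count in reference_counts.items()
        if count > 0
    }


def codon_adaptation_index(coding_dna, weights):
    """Geometric mean of codon weights over the coding sequence."""
    seq = coding_dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    log_weights = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        # Met, Trp (single codon each) and stop codons carry no information.
        if GENETIC_CODE[codon] in "MW*":
            continue
        if codon not in weights:
            raise ValueError(f"no reference weight for codon {codon}")
        log_weights.append(math.log(weights[codon]))
    if not log_weights:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(log_weights) / len(log_weights))
```

With the toy table, a sequence using only the most frequent codons (`GCCAAG`) scores 1.0, while replacing one of them with a codon used half as often lowers the score to the square root of 0.5, about 0.71.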

In some embodiments, the nucleic acids, for example, a DNA, of the present invention have also been optimized using additional parameters. For example, the wild-type human BChE gene has been found to have an average G+C content of about 40%, as determined from the GC content in a 40 bp window centered about various nucleotide positions. By contrast, nucleic acids according to the present invention, such as the codon optimized nucleic acids disclosed herein, have an average GC content of about 60%. In producing nucleic acids of the present invention, very high or very low GC content has been avoided, so that any GC content less than 30% or above 80% has been avoided.
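The GC-content criteria can be illustrated with a short Python sketch that computes overall GC percentage, GC percentage in a sliding 40 bp window, and the start positions of any windows falling outside the 30% to 80% range. The function names and the convention of reporting windows by their start index (rather than by center position) are assumptions made for illustration.

```python
# Illustrative sketch only: GC content overall and per sliding window.
def gc_percent(dna: str) -> float:
    """Percentage of G plus C bases in `dna`."""
    seq = dna.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def windowed_gc(dna: str, window: int = 40) -> list:
    """GC percentage of every full window, sliding one base at a time."""
    return [
        gc_percent(dna[start:start + window])
        for start in range(len(dna) - window + 1)
    ]


def windows_outside_range(dna: str, low: float = 30.0, high: float = 80.0,
                          window: int = 40) -> list:
    """Start indices of windows whose GC content is below low or above high."""
    return [
        start
        for start, value in enumerate(windowed_gc(dna, window))
        if value < low or value > high
    ]
```

A candidate coding sequence would pass this particular check when `windows_outside_range` returns an empty list; the overall `gc_percent` value can then be compared with the thresholds recited for the coding region.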

The present invention relates to an isolated nucleic acid, DNA or RNA, that encodes a polypeptide having BChE enzyme activity (for example, as measured using the well known Ellman assay; see Ellman, G. L., et al., Biochem. Pharmacol., Vol. 7, pp. 88-95 (1961)), wherein the percentage of guanine plus cytosine (G+C) nucleotides in the coding region of said nucleic acid is greater than 40%, or is greater than 45%, or is greater than 50%, or is greater than 55%, or is at least 60%, or is greater than 60% but is less than 80%. In all cases, codon usage has been adapted to the codon bias of human (Homo sapiens) genes.

Specific embodiments of an isolated nucleic acid of the invention include nucleic acids comprising a nucleotide sequence having at least 80% identity, or at least 90% identity, preferably at least 95% identity, more preferably at least 98% identity to the nucleotide sequence of SEQ ID NO: 1 or the complement thereof. In a preferred embodiment, the nucleic acid of the invention comprises, more preferably consists essentially of, and most preferably consists of, the nucleotide sequence of SEQ ID NO: 1 or the complement thereof. In all cases, such nucleic acids, in addition to SEQ ID NO: 1 itself, meet the other requirements of the invention regarding GC content and/or CAI and like parameters.

In one embodiment, the BChE polypeptide encoded by the nucleic acid of the invention comprises, preferably consists of, amino acids 22-564 of SEQ ID NO: 2. In some embodiments, the encoded BChE polypeptide contains fewer than the number of amino acids in SEQ ID NO: 2 (i.e., is a truncated, variant or modified form of said sequence, such as the truncate shown as SEQ ID NO: 4 and/or in FIG. 3, with or without the signal sequence). Such modification may affect the overall 3-dimensional structure or shape of the resulting BChE enzyme. Consequently, the resulting encoded polypeptide does not readily form tetramers or even dimers, but remains in a monomeric state.

In accordance with the invention, such monomeric BChE polypeptides are achieved by producing a BChE polypeptide that differs from the native BChE of SEQ ID NO: 6, or is a variant of such polypeptide, such as a polypeptide having less than 100% identity to the sequence of SEQ ID NO: 2 or 4 or amino acids 22-564 of said sequences but retaining substantially all of the BChE enzyme activity of said polypeptide, or where one or more amino acids present in such polypeptides are either different or not included in said polypeptide, which is thus a variant or shortened or truncated polypeptide and which forms mostly, or all, or only, monomers.

For example, such a truncate commonly differs from native, or full length, BChE (such as SEQ ID NO: 6 starting at amino acid 17) in that a specific domain is not present. In a preferred embodiment, such a domain (referred to in the art as the WAT domain) comprises the C-terminal portion of the BChE polypeptide, preferably the C-terminal 20 to 40 amino acids of a native BChE, more preferably the C-terminal 25 to 35 amino acids of native BChE, most preferably the C-terminal 31 amino acids of such BChE. To produce a truncate within the invention, any amino acid segment after the tryptophan residue at residue 564 of SEQ ID NO: 2 (FIG. 1) may be deleted from the sequence of the mature protein. In one embodiment, said BChE polypeptide does not form multimers because all or a portion of said WAT domain has been deleted or because one or more amino acid substitutions have been introduced into said domain to render it non-functional for the purpose of inducing multimer formation of the resulting mature polypeptide.

For expression from DNA, this is accomplished by insertion of a termination codon following the codon encoding such tryptophan, either immediately following it or following a codon 3′ of said tryptophan codon, so as to shorten the resulting encoded amino acid sequence of the BChE protein. Where the latter is to be synthesized by direct chemical synthesis, the sequence from the N-terminus of SEQ ID NO: 2 up to or exceeding the tryptophan at residue 564 is included in the synthetic product but not some, most or all of the amino acids C-terminal of said tryptophan to form a truncate. BChE truncates of the present invention may or may not include a signal sequence, such as that shown in FIG. 1, 2 or 3. Thus, a truncate of the mature polypeptide is contemplated by the invention. One such example would consist of amino acids 22-564 of SEQ ID NO: 4.
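The insertion of a termination codon after a chosen residue can be sketched as follows. This is a hedged illustration on a toy coding sequence, not on SEQ ID NO: 1; the function name and the choice of TGA as the stop codon are assumptions, and residue numbering is assumed to count from the first codon of the coding sequence supplied.

```python
def truncate_after_codon(cds, last_residue, stop="TGA"):
    """Keep codons 1..last_residue of cds and terminate with a stop codon."""
    if len(cds) < 3 * last_residue:
        raise ValueError("coding sequence is shorter than the requested residue")
    return cds[:3 * last_residue] + stop

# Toy coding sequence: Met-Ala-Trp-Lys-Ala-stop; truncate after the Trp codon.
toy_cds = "ATGGCCTGGAAAGCCTAA"
toy_truncate = truncate_after_codon(toy_cds, last_residue=3)
```

For a truncate ending at the tryptophan at residue 564, the same call with `last_residue=564` would apply, provided the supplied coding sequence begins at residue 1 of the numbering used.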

Such a monomer, especially one composed of such a truncate, has the advantage of a much better defined amino acid sequence and is capable of being synthesized in amounts up to 10 times, 20 times or even 40 times, or more, the amounts normally synthesized of the tetrameric form of BChE and at greatly reduced cost, thereby making it a much more desirable therapeutic agent from both a clinical and commercial viewpoint.

The sequence KAGFHRWNNYMMDWKNQFNDYTSKKESCVGL (SEQ ID NO: 7), located at amino acid residues 565-595 of SEQ ID NO: 2, has been found to be involved in formation of multimeric BChE molecules, such as in the dimerization and/or tetramerization of BChE (see, for example, Blong et al., supra, which contains additional structural information concerning such domains). By the deletion of part or all of this domain, most, if not all, of the resulting BChE product is rendered not capable of forming multimers and so remains in monomeric form. In forming such a truncate of the invention, only so much of this domain need be removed as is required to prevent dimer or tetramer formation from the synthesized monomer. In one embodiment, such as SEQ ID NO: 4 (FIG. 3), all of it is missing. For such a product, the expression of the monomer is higher, the purification and characterization easier, and the cost of the product substantially lower.

In one preferred embodiment, this truncated form of BChE is expressed in Per.C6 cells with the optimal clone selected. This truncated BChE has a long MRT, making it the preferable form as a drug product. In other embodiments, such truncate is also prepared by direct synthesis and other means, such as using recombinant cells, preferably mammalian cells, more preferably human cells, most preferably Per.C6 (or PerC6) cells, that achieve high levels of glycosylation and/or sialylation of a heterologous protein, or where synthesis, either in vitro or in vivo, is followed by in vitro glycosylation and/or sialylation.

In a preferred embodiment, a BChE truncate of the invention comprises the amino acid sequence of SEQ ID NO: 4 (shown in FIG. 3), which comprises a human signal sequence at the N-terminus.

A BChE truncate of the present invention is thus a BChE molecule with part or all of the WAT domain removed. Without a functioning WAT domain, the molecule does not form multimers, such as tetramers and/or dimers, but forms mostly, if not only, monomers. The selection of the truncation site, for example, after W at 564 (of SEQ ID NO: 2, for example) facilitates a more uniform C-terminal region. Use of Per.C6 cells coupled with the selection of high glycosylation and high sialylation clone(s) ensures long serum half-life and is a preferred embodiment of the invention. In sum, such a truncated BChE construct is simple, results in higher yield, longer serum half-life and a more homogeneous product, meeting all the requirements needed for an effective pharmaceutical agent.

The BChE truncate of the present invention can be produced by any means known in the art, including by direct chemical synthesis, and the sequence of such truncate, where prepared by expression of an encoding DNA, need not be derived from a codon-optimized DNA. Standard references are available that contain procedures well known in the art of molecular biology and genetic engineering for producing the nucleic acids and polypeptides of the present invention. Useful references include Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, N.Y., (1989), Wu et al., Methods in Gene Biotechnology (CRC Press, New York, N.Y., 1997), and Recombinant Gene Expression Protocols, in Methods in Molecular Biology, Vol. 62, (Tuan, ed., Humana Press, Totowa, N.J., 1997), the disclosures of which are hereby incorporated by reference.

In another aspect, the present invention relates to a vector comprising a nucleic acid of the invention, as well as to a recombinant cell containing such a vector and expressing a BChE polypeptide by expressing a nucleic acid of the invention, preferably where that nucleic acid is present in a vector of the invention. The present invention also relates to a method of preparing such a polypeptide having BChE enzyme activity, comprising expressing the polypeptide from a recombinant cell as described herein, preferably where the polypeptide comprises the amino acid sequence of SEQ ID NO: 4 (for example, SEQ ID NO: 2), including where the polypeptide forms only monomers, for example, where such polypeptide does not contain portions of one or more domains that promote formation of such supra-molecular structures, including where the entire domain is absent or sequence altered.

As noted above, monomers were thought to be less active than tetramers, but this is likely due to improper glycosylation. The present invention provides a codon optimized nucleic acid encoding BChE and appropriate glycosylation sites coupled with a cell line especially useful for expressing the properly glycosylated truncated molecule as a fully active monomer possessing good retention time.

A nucleic acid of the invention used to transform the cells useful in practicing the invention is a synthetic gene of which the nucleic acid of SEQ ID NO: 1 is only a specific yet preferred example. Such nucleic acids include modified forms of SEQ ID NO: 1. The expression “modified form” refers to other nucleic acid sequences which encode BChE (including fragments or variants thereof) and have BChE enzyme activity but which utilize different codons, provided the requirement for the percentage GC content and other criteria recited in accordance with the invention are met. Suitable modified forms include those that comprise at least 80% identity, preferably at least 90% identity, more preferably at least 95% identity and even more preferably at least 98% identity to SEQ ID NO: 1.

In one embodiment, the nucleic acid sequence of SEQ ID NO: 1 and/or SEQ ID NO: 3 has been codon-optimized for expression in human cells, such as Per.C6 cells, a fully characterized human cell line for use with recombinant adenoviral vectors (available from Crucell N.V., Leiden, The Netherlands). This cell line has the advantage of producing heterologous proteins in good yield. It has also been found to be free of prions, is easily transfected with exogenous genes and grows well in commercially available media free of animal and/or human derived proteins.

In addition to codon optimization, the different embodiments of the invention have been achieved by further optimizing the nucleic acids of the invention to avoid inclusion of polynucleotide sequence elements that would otherwise reduce expression of the nucleic acid, and subsequent synthesis of BChE. In particular, said sequence elements may be selected from the group comprising: negative elements or repeat sequences, cis-acting motifs such as splice sites, internal TATA-boxes and ribosomal entry sites.

A TATA box (or TATA) site is well known in the art and generally represents a consensus sequence found in the promoter region of genes that are transcribed by the RNA polymerase II found in mammalian, such as human, cells. It is often located about 25 nucleotides upstream of the transcription start site (often having the sequence 5′ TATAAAA 3′ (SEQ ID NO: 11)). It is relevant in determining the initiation site for gene transcription. However, when such a site is present internally within the coding region of a gene it can adversely affect (i.e., slow) gene expression and is thus to be avoided where efficient high level expression is sought. Where possible, such sequences have been avoided in the nucleic acids of the present invention.

Gene expression is also slowed by other structural motifs found in coding regions of genes. One such motif is the chi-site, which can induce homologous recombination, thereby disrupting the cloned gene. For example, the enzyme RecBCD (a heterotrimeric helicase that initiates homologous recombination at double-stranded DNA breaks) can be modulated by the DNA sequence denoted “chi” (i.e., 5′-GCTGGTGG-3′ (SEQ ID NO: 8)). Such chi-sites have been avoided, where possible, in achieving the nucleic acids of the present invention, which preferably neither contain nor encode such chi sites.
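Screening a candidate sequence for the TATA and chi motifs named above can be sketched as follows. This is a minimal illustration using exact string matching on both strands; the function names, the dictionary of motifs and the toy sequence are hypothetical, and a practical screen would likely also allow for degenerate variants of each motif.

```python
# Motifs taken from the text: SEQ ID NO: 11 (TATA) and SEQ ID NO: 8 (chi).
AVOIDED_MOTIFS = {"TATA box": "TATAAAA", "chi site": "GCTGGTGG"}

def find_motif(seq, motif):
    """0-based start positions of motif in seq, overlapping hits included."""
    seq, motif = seq.upper(), motif.upper()
    hits, start = [], seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with A, C, G and T."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def scan(seq):
    """Hits per motif on the given strand and on its reverse complement.

    Reverse-strand positions are in reverse-complement coordinates.
    """
    rc = reverse_complement(seq)
    return {name: {"forward": find_motif(seq, motif),
                   "reverse": find_motif(rc, motif)}
            for name, motif in AVOIDED_MOTIFS.items()}

toy_report = scan("CCTATAAAAGGGCTGGTGGAA")
```

A sequence that meets the criterion described in the text would return empty lists for every motif on both strands.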

In eukaryotes, the Kozak sequences 5′-ACCACCAUGG-3′ (SEQ ID NO: 9) or 5′-GCCACCAUGG-3′ (SEQ ID NO: 10), which lie within a short 5′ untranslated region, direct translation of mRNA and are thus upstream of the translation start site (the AUG codon that begins translation). These sequences are effectively recognized by the ribosome as a translation start site and are different from the internal ribosomal binding site (RBS), which includes an internal ribosomal entry site (IRES) or the 5′ cap of the mRNA molecule. The strength of the Kozak sequence can determine the extent of translation of the mRNA and thus the amount of protein produced. Internal ribosomal entry sites have been avoided, where feasible, in achieving the nucleic acids of the invention. However, the nucleic acids, or genetic constructs, of the invention, for purposes of expression from recombinant cells of the invention, preferably do encode a Kozak site upstream of the start site.
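Checking that a construct carries one of the two Kozak sequences at its start codon can be sketched as follows. This is an illustrative sketch only; the function name and toy constructs are hypothetical, the sequences are written in their DNA form (T in place of U), and the match includes the base immediately following the ATG, as in SEQ ID NO: 9 and 10.

```python
# DNA forms of the Kozak sequences given as SEQ ID NO: 9 and SEQ ID NO: 10.
KOZAK_SITES = ("ACCACCATGG", "GCCACCATGG")

def has_kozak(construct, start_index):
    """True if a listed Kozak sequence is placed so its ATG begins at start_index."""
    if start_index < 6:
        # Fewer than six bases upstream, so the full site cannot be present.
        return False
    site = construct.upper()[start_index - 6:start_index + 4]
    return site in KOZAK_SITES

# Toy constructs in which the start codon ATG begins at 0-based index 8.
with_site = has_kozak("TTGCCACCATGGCA", 8)
without_site = has_kozak("TTTTTTTTATGCCC", 8)
```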

Protein-coding genes of mammals may also contain introns that are involved in RNA splicing events that take place after transcription is complete but prior to translation at the ribosome. Sequences of such sites are known in the art and have, where feasible, been avoided in designing the sequences of the nucleic acids of the invention, which are made up mostly of coding sequence and are thus cDNA in nature. For example, such a splice site may contain an almost invariant GT sequence at the 5′ end of the intron as part of a larger less conserved region. The 3′ splice site or splice acceptor site terminates the intron with an almost always present AG sequence. Upstream of this AG site is often found a sequence rich in pyrimidine content (i.e., C and T nucleotides). Such structural motifs are well known in the art and, where feasible, have been likewise avoided in achieving the nucleic acids, or DNA constructs, of the present invention.

Recombinant butyrylcholinesterase forms often exhibit variation in the type of sugar residues found within the different sugars attached to the molecule. Such variation can negatively affect the mean retention time (MRT) of the BChE molecule in vivo. Among the factors that can determine such variability are the number and arrangement of non-sialylated galactose and mannose residues as well as the host cell used to produce the glycosylated final product in BChE expression, since different expression systems may glycosylate the BChE molecule differently. Processes of in vitro glycosylation after synthesis have been attempted by those in the art to avoid such problems. For example, it has been shown that the stability of BChE is affected by capping of the terminal carbohydrate residues with sialic acid, since uncapped galactose residues bind to receptors on hepatocytes and thereby clear the protein from the system. The present invention preferably utilizes the Per.C6 expression system so as to achieve as close a similarity to the native human glycosylation pattern for BChE as is possible.

What is claimed is:
1. An isolated nucleic acid that encodes a polypeptide having butyrylcholinesterase (BChE) enzyme activity, wherein the percentage of guanine plus cytosine (G+C) nucleotides in the coding region of said nucleic acid is greater than 40% but not greater than 80%, wherein said nucleic acid has at least 90% identity to the nucleotide sequence of SEQ ID NO: 1 and wherein the encoded polypeptide does not comprise a functional WAT domain.
2. The isolated nucleic acid of claim 1, wherein the percentage of guanine plus cytosine (G+C) nucleotides in the coding region of said nucleic acid is greater than 45% but not greater than 80%.
3. The isolated nucleic acid of claim 1, wherein the percentage of guanine plus cytosine (G+C) nucleotides in the coding region of said nucleic acid is greater than 50% but not greater than 80%.
4. The isolated nucleic acid of claim 1, wherein the percentage of guanine plus cytosine (G+C) nucleotides in the coding region of said nucleic acid is greater than 55% but not greater than 80%.
5. The isolated nucleic acid of claim 1, wherein the percentage of guanine plus cytosine (G+C) nucleotides in the coding region of said nucleic acid is greater than 60% but not greater than 80%.
6. The isolated nucleic acid of claim 1, wherein the percentage of guanine plus cytosine (G+C) nucleotides in the coding region of said nucleic acid is at least 60% but not greater than 80%.
7. The isolated nucleic acid of claim 1, wherein said nucleic acid does not contain or encode an internal TATA-box.
8. The isolated nucleic acid of claim 1, wherein said nucleic acid encodes one or more sialylation sites on said polypeptide.
9. The isolated nucleic acid of claim 1, wherein said nucleic acid does not contain or encode an internal ribosomal entry site.
10. The isolated nucleic acid of claim 1, wherein said nucleic acid does not contain or encode a splice donor or acceptor site.
11. The isolated nucleic acid of claim 1, wherein said nucleic acid contains or encodes at least one Kozak sequence upstream of the start site.
12. The isolated nucleic acid of claim 1, wherein the sequence of said nucleic acid has a Codon Adaptation Index (CAI) of at least 0.7.
13. The isolated nucleic acid of claim 1, wherein said nucleic acid has a CAI of at least 0.8.
14. The isolated nucleic acid of claim 1, wherein said nucleic acid has a CAI of at least 0.9.
15. The isolated nucleic acid of claim 1, wherein said nucleic acid has a CAI of at least 0.97.
16. The isolated nucleic acid of claim 1, wherein said percent identity is at least 95%.
17. The isolated nucleic acid of claim 1, wherein said percent identity is at least 98%.
18. The isolated nucleic acid of claim 1, wherein said nucleic acid is SEQ ID NO: 3.
19. The isolated nucleic acid of claim 1, wherein said BChE polypeptide consists of amino acids 22 to 564 of SEQ ID NO: 2.
20. The isolated nucleic acid of claim 1, wherein said BChE polypeptide consists of amino acids 1 to 564 of SEQ ID NO: 2.
21. The isolated nucleic acid of claim 1, wherein said encoded polypeptide contains one or more amino acid substitutions that render the WAT domain non-functional for the purpose of inducing multimer formation of the encoded polypeptide.
22. The isolated nucleic acid of claim 1, wherein said encoded polypeptide is missing all or a portion of the WAT domain.
23. An isolated fragment of the nucleic acid of claim 1, wherein said fragment encodes a polypeptide having BChE enzyme activity.
24. A vector comprising the nucleic acid of claim 1.
25. A recombinant cell containing the vector of claim 24.
26. A method of preparing a polypeptide having BChE enzyme activity, comprising expressing said polypeptide from the cell of claim 25.
27. The recombinant cell of claim 25, wherein said cell is a mammalian cell.
28. The recombinant cell of claim 25, wherein said cell is a human cell.
29. The recombinant cell of claim 25, wherein said cell is a Per.C6 cell.
30. The method of claim 26, wherein said polypeptide forms only monomers.
31. The method of claim 26, wherein said polypeptide does not contain all or a portion of the WAT domain.
32. The method of claim 31, wherein said polypeptide does not contain the WAT domain.
33. The method of claim 26, wherein said polypeptide does not contain all or a portion of the amino acid sequence of SEQ ID NO: 7.
34. The method of claim 26, wherein said polypeptide consists of amino acids 1 to 564 of SEQ ID NO: 2.
35. The method of claim 26, wherein said polypeptide consists of amino acids 22-564 of SEQ ID NO: 2.
36. The isolated nucleic acid of claim 1, wherein said nucleic acid is a DNA or the complement thereof.
37. The isolated nucleic acid of claim 36, wherein said DNA is a cDNA or the complement thereof.
38. The isolated nucleic acid of claim 1, wherein said nucleic acid is an RNA.